Using NLP to build the hypertextuel network of a back-of-the-book index

Authors

  • Touria Aït El Mekki
  • Adeline Nazarenko
Abstract

Relying on the idea that back-of-the-book indexes are traditional devices for navigating large documents, we have developed a method to build a hypertextual network that supports navigation within a document. Building such a hypertextual network requires selecting a list of descriptors, identifying the relevant text segments to associate with each descriptor, and finally ranking the descriptors and reference segments in order of relevance. We propose a specific document segmentation method and a relevance measure for information ranking. The algorithms are tested on four corpora (of different types and domains) without human intervention or any semantic knowledge.
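
The three steps named in the abstract (select descriptors, attach text segments, rank both by relevance) can be illustrated with a minimal sketch. The abstract does not describe the authors' segmentation method or relevance measure, so this sketch swaps in stand-ins: paragraphs split on blank lines serve as segments, and a TF-IDF-style score serves as relevance. The names split_segments and build_index are hypothetical, and descriptors are assumed to be single words supplied by the caller.

```python
import math
import re


def split_segments(text):
    """Naive stand-in segmentation: one segment per blank-line-separated paragraph."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def build_index(text, descriptors):
    """Link each descriptor to the segments that mention it, ranked by relevance.

    Relevance here is a TF-IDF-style score (an assumption, not the paper's measure).
    Returns a list of (descriptor, [(segment_id, score), ...]) pairs; segment
    references and descriptors are both sorted by decreasing score.
    """
    segments = split_segments(text)
    n = len(segments)
    tokens = [re.findall(r"\w+", s.lower()) for s in segments]
    index = {}
    for d in descriptors:
        counts = [toks.count(d.lower()) for toks in tokens]
        df = sum(1 for c in counts if c > 0)
        if df == 0:
            continue  # descriptor never occurs: no index entry
        idf = math.log((1 + n) / (1 + df)) + 1.0
        refs = [(i, c / len(tokens[i]) * idf) for i, c in enumerate(counts) if c > 0]
        refs.sort(key=lambda r: (-r[1], r[0]))
        index[d] = refs
    # Rank descriptors by the score of their best segment, then alphabetically.
    return sorted(index.items(), key=lambda kv: (-kv[1][0][1], kv[0]))


if __name__ == "__main__":
    sample = "index index book\n\nbook network\n\nindex network book"
    for descriptor, refs in build_index(sample, ["network", "index", "missing"]):
        print(descriptor, [segment_id for segment_id, _ in refs])
```

Each (descriptor, segment_id) pair plays the role of a hypertext link from an index entry to a passage; a real system would replace both stand-ins with the segmentation and ranking described in the full paper.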


Similar resources

Short-term Prediction of Tehran Stock Exchange Price Index (TEPIX): Using Artificial Neural Network (ANN)

The main objective of this study is to find out whether an Artificial Neural Network (ANN) will be useful to predict stock market price, which is highly non-linear and uncertain. Specifically, this study will focus on forecasting TSE Price Index (TEPIX) as the most significant index of Iran Stock Market. Many data have been used as inputs to the network. These data are observations of 2000 day...


Comparison of Artificial Neural Network and Multiple Regression Analysis for Prediction of Fat Tail Weight of Sheep

A comparative study of artificial neural network (ANN) and multiple regression is made to predict the fat tail weight of Balouchi sheep from birth, weaning and finishing weights. A multilayer feed forward network with back propagation of error learning mechanism was used to predict the sheep body weight. The data (69 records) were randomly divided into two subsets. The first subset is the train...


Estimation of coal swelling index based on chemical properties of coal using artificial neural networks

Free swelling index (FSI) is an important parameter for cokeability and combustion of coals. In this research, the effects of chemical properties of coals on the coal free swelling index were studied by artificial neural network methods. The artificial neural networks (ANNs) method was used for 200 datasets to estimate the free swelling index value. In this investigation, ten input parameters ...


Investigating the Extent of Correspondence between the Persian Book Final Indices and ISO 999 and B.S. 3700 Standards: The Case of the Field of Library and Information Sciences

Background and Aim: The purpose of this study was to investigate the extent of observing the standards of indexing (ISO 999-1996, BS 3700) of Library and Information Sciences books. Method: The study used descriptive-analytical methodology and the population consisted of all the Persian books, written and translated, in the field of Library and Information Sciences published from 2006 to 2012 w...


Evaluation of the Indexes of Persian Medical Books Published from 1389 to 1393 According to the ISO 999:1996 Standard

Introduction: Given the central role of the back-of-the-book index (BOB index) in retrieving the concepts of a book, the aim of this study is to evaluate the BOB indexes of Persian medical books published from 1389 to 1393 according to ISO 999-1996. Methods: This study is a descriptive survey. The research population consisted of all first-published medical printed books in the mentioned period f...



Journal title:
  • CoRR

Volume abs/cs/0609134  Issue 

Pages  -

Publication date 2006